@dataknut)

Please note that authorship is alphabetical. Contributions are listed below; see GitHub for details and who to blame for what :-).

@dataknut)

If you wish to refer to any of the material from this report, please cite as:
Report circulation:
Report purpose:
official Southampton City Council Air Quality data (http://southampton.my-air.uk)

This work is (c) 2019 the University of Southampton.
Data downloaded from http://southampton.my-air.uk. See also https://www.southampton.gov.uk/environmental-issues/pollution/air-quality/.
Load previously processed data…
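library(data.table)  # fread(), rbindlist()
library(kableExtra)  # kable_styling() and the %>% pipe
library(ggplot2)     # aes(), geom_tile(), geom_line(), ...
dataPath <- path.expand("~/Data/SCC/airQual/")
# list the processed (gzipped) files; pattern is a regular expression, not a glob
files <- list.files(paste0(dataPath, "processed"), pattern = "\\.gz$", full.names = TRUE)
# read each file, then merge them all
l <- lapply(files, data.table::fread)
dt <- rbindlist(l, fill = TRUE)  # fill = TRUE pads columns missing from a file with NA
skimr::skim(dt)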
## Skim summary statistics
## n obs: 104844
## n variables: 10
##
## ── Variable type:character ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
## variable missing complete n min max empty n_unique
## MeasurementDateGMT 0 104844 104844 16 16 0 17474
## site 0 104844 104844 22 31 0 6
##
## ── Variable type:logical ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
## variable missing complete n mean count
## co 104844 0 104844 NaN 104844
##
## ── Variable type:numeric ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
## variable missing complete n mean sd p0 p25 p50 p75 p100
## nox 44412 60432 104844 22.37 33.09 -5 4.2 11.5 27.3 492.3
## nox2 44417 60427 104844 34.16 20.03 0 19.1 30.9 45.5 174.7
## noxes 68605 36239 104844 71.05 64.01 1.5 32.3 54.9 87.8 904.8
## oz 88642 16202 104844 40.86 23.99 -0.2 23.7 41.3 56.5 174.1
## pm10 74617 30227 104844 18.02 10.97 -1.5 11 15.5 22.2 252.5
## pm2_5 89429 15415 104844 12.98 9.41 -4 7.3 10.3 15.6 239.1
## so2 89058 15786 104844 4.01 3.87 -1.1 1.2 2.4 5.8 49.6
## hist
## ▇▁▁▁▁▁▁▁
## ▆▇▃▁▁▁▁▁
## ▇▁▁▁▁▁▁▁
## ▆▇▇▂▁▁▁▁
## ▇▁▁▁▁▁▁▁
## ▇▁▁▁▁▁▁▁
## ▇▂▁▁▁▁▁▁
dt[, obsDateTime := lubridate::ymd_hm(MeasurementDateGMT)]
t <- dt[, .(`co: Carbon Monoxide, mg/m3` = mean(co, na.rm = TRUE),
    `nox = Nitric Oxide, ug/m3` = mean(nox, na.rm = TRUE),
    `nox2 = Nitrogen Dioxide, ug/m3` = mean(nox2, na.rm = TRUE),
    `noxes = Oxides of Nitrogen, ug/m3` = mean(noxes, na.rm = TRUE),
    `oz = ozone, ug/m3` = mean(oz, na.rm = TRUE),
    `pm10, ug/m3` = mean(pm10, na.rm = TRUE),
    `pm2_5, ug/m3` = mean(pm2_5, na.rm = TRUE),
    `so2 = Sulphur Dioxide, ug/m3` = mean(so2, na.rm = TRUE)),
    keyby = .(site)]
kableExtra::kable(t, caption = "Mean values per site (NaN indicates not measured)") %>%
    kable_styling()
| site | co: Carbon Monoxide, mg/m3 | nox = Nitric Oxide, ug/m3 | nox2 = Nitrogen Dioxide, ug/m3 | noxes = Oxides of Nitrogen, ug/m3 | oz = ozone, ug/m3 | pm10, ug/m3 | pm2_5, ug/m3 | so2 = Sulphur Dioxide, ug/m3 |
|---|---|---|---|---|---|---|---|---|
| Southampton - A33 Roadside AURN | NaN | 26.49725 | 32.27554 | NaN | NaN | 16.89069 | NaN | NaN |
| Southampton - Bitterne | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Southampton - Onslow Road | NaN | 25.89773 | 39.91412 | 79.62518 | NaN | NaN | NaN | NaN |
| Southampton - Redbridge | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Southampton - Victoria Road | NaN | 25.23369 | 36.80428 | 75.49788 | NaN | NaN | NaN | NaN |
| Southampton Background AURN | NaN | 12.60205 | 28.37414 | 47.85700 | 40.86236 | 19.30788 | 12.97722 | 4.009033 |
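Table 3.1 gives an indication of the availability of the different measures: a NaN mean indicates that the measure was not recorded at that site. Only the Southampton Background AURN site reports oz, pm2_5 and so2, no site reports co, and Bitterne and Redbridge report none of the measures in these data.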
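Means only show whether a measure exists at a site, not how much of it there is. A minimal sketch of counting non-missing observations per site and pollutant is given below. It uses a small made-up table (`toy`, with hypothetical sites "A" and "B") rather than the real data, and assumes the pollutant columns are named as in `dt`.

```r
library(data.table)

# Toy stand-in for dt (hypothetical values, not the Southampton data)
toy <- data.table(
  site = c("A", "A", "A", "B", "B", "B"),
  nox  = c(10.2, NA, 12.5, NA, NA, NA),
  pm10 = c(NA, 15.0, 16.1, 20.3, NA, 18.4)
)

pollutants <- c("nox", "pm10")

# Count the non-missing observations of each pollutant at each site
availability <- toy[, lapply(.SD, function(x) sum(!is.na(x))),
                    keyby = site, .SDcols = pollutants]
availability
```

A count of 0 corresponds to a NaN mean in the table above. Applied to the real data, `pollutants` would presumably list all eight measure columns.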
t <- dt[, .(mean = mean(nox, na.rm = TRUE),
    sd = sd(nox, na.rm = TRUE),
    min = min(nox, na.rm = TRUE),
    max = max(nox, na.rm = TRUE)),
    keyby = .(site)]
kableExtra::kable(t, caption = "Summary of nox data") %>% kable_styling()
| site | mean | sd | min | max |
|---|---|---|---|---|
| Southampton - A33 Roadside AURN | 26.49725 | 36.26582 | 0.0 | 404.9 |
| Southampton - Bitterne | NaN | NA | Inf | -Inf |
| Southampton - Onslow Road | 25.89773 | 33.59185 | -2.9 | 444.1 |
| Southampton - Redbridge | NaN | NA | Inf | -Inf |
| Southampton - Victoria Road | 25.23369 | 34.93944 | -5.0 | 492.3 |
| Southampton Background AURN | 12.60205 | 24.85957 | 0.1 | 395.3 |
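Table 4.1 shows negative minimum values at two sites (Onslow Road and Victoria Road), which indicates that the data contain a few (150) negative nox values. The first 10 are listed in Table 4.2, followed by a count by site.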
t <- head(dt[nox < 0], 10)
kableExtra::kable(t, caption = "Negative nox values (first 10)") %>% kable_styling()
| MeasurementDateGMT | nox | nox2 | noxes | pm10 | site | co | oz | pm2_5 | so2 | obsDateTime |
|---|---|---|---|---|---|---|---|---|---|---|
| 2018-01-29 03:00 | -0.4 | 16.7 | 16.1 | NA | Southampton - Onslow Road | NA | NA | NA | NA | 2018-01-29 03:00:00 |
| 2018-05-26 22:00 | -0.1 | 13.5 | 13.3 | NA | Southampton - Onslow Road | NA | NA | NA | NA | 2018-05-26 22:00:00 |
| 2018-06-08 02:00 | -0.6 | 14.7 | 13.9 | NA | Southampton - Onslow Road | NA | NA | NA | NA | 2018-06-08 02:00:00 |
| 2018-06-10 21:00 | -0.2 | 29.5 | 29.3 | NA | Southampton - Onslow Road | NA | NA | NA | NA | 2018-06-10 21:00:00 |
| 2018-06-11 19:00 | -0.1 | 13.8 | 13.5 | NA | Southampton - Onslow Road | NA | NA | NA | NA | 2018-06-11 19:00:00 |
| 2018-06-14 03:00 | -0.1 | 7.6 | 7.5 | NA | Southampton - Onslow Road | NA | NA | NA | NA | 2018-06-14 03:00:00 |
| 2018-06-27 19:00 | -0.1 | 19.0 | 18.8 | NA | Southampton - Onslow Road | NA | NA | NA | NA | 2018-06-27 19:00:00 |
| 2018-06-27 21:00 | -0.2 | 26.5 | 26.2 | NA | Southampton - Onslow Road | NA | NA | NA | NA | 2018-06-27 21:00:00 |
| 2018-07-03 22:00 | -0.1 | 20.4 | 20.3 | NA | Southampton - Onslow Road | NA | NA | NA | NA | 2018-07-03 22:00:00 |
| 2018-07-21 02:00 | -0.3 | 14.6 | 14.2 | NA | Southampton - Onslow Road | NA | NA | NA | NA | 2018-07-21 02:00:00 |
t <- table(dt[nox < 0]$site)
kableExtra::kable(t, caption = "Negative nox values (count by site)") %>% kable_styling()
| Var1 | Freq |
|---|---|
| Southampton - Onslow Road | 113 |
| Southampton - Victoria Road | 37 |
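The skim summary shows negative minima for several other measures as well (oz, pm10, pm2_5, so2), so the same check may be worth running across all pollutants at once. The sketch below does this on a small made-up table (`toy`, hypothetical values). The final step, which sets negative values to NA in a new column (`noxClean`, a hypothetical name), is only one possible treatment and is an assumption rather than anything this report does.

```r
library(data.table)

# Toy stand-in for dt (hypothetical values, not the Southampton data)
toy <- data.table(
  site = c("A", "A", "B", "B"),
  nox  = c(-0.4, 11.5, 4.2, -0.1),
  pm10 = c(15.5, -1.5, NA, 22.2)
)

# Reshape to long form so every pollutant can be checked in one pass
long <- melt(toy, id.vars = "site", variable.name = "pollutant",
             value.name = "value", na.rm = TRUE)

# Number of negative readings per site and pollutant
negCounts <- long[value < 0, .N, keyby = .(site, pollutant)]
negCounts

# One possible treatment (an assumption, not the report's method):
# keep the raw column and add a copy with negatives set to NA
toy[, noxClean := fifelse(nox < 0, NA_real_, nox)]
```

Keeping the raw `nox` column alongside the cleaned copy means the decision can be revisited later without reloading the data.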
p <- ggplot2::ggplot(dt, aes(x = obsDateTime, y = site, fill = nox)) +
    geom_tile() +
    scale_fill_continuous(low = "green", high = "red") +
    labs(x = "Time")
p
Figure 4.1: nox data availability
The following interactive plot shows hourly nox values for all sites.
p <- ggplot2::ggplot(dt, aes(x = obsDateTime, y = nox, colour = site)) + geom_line()
p <- p + theme(legend.position = "bottom")
plotly::ggplotly(p) # interactive
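Hourly series for six sites can be hard to read on a single plot. One option, assuming daily averages are acceptable for the purpose at hand, is to aggregate before plotting. The sketch below uses a small made-up table (`toy`, hypothetical values); `obsDate`, `meanNox` and `nObs` are illustrative names that do not appear in the report.

```r
library(data.table)

# Toy stand-in for dt (hypothetical values, not the Southampton data)
toy <- data.table(
  site = "A",
  obsDateTime = as.POSIXct(c("2018-01-29 01:00", "2018-01-29 02:00",
                             "2018-01-30 01:00"), tz = "UTC"),
  nox = c(10, 20, 5)
)

# Derive the calendar date in UTC, matching the GMT timestamps
toy[, obsDate := as.Date(obsDateTime, tz = "UTC")]

# Daily mean per site, with the number of hourly values behind each mean
daily <- toy[, .(meanNox = mean(nox, na.rm = TRUE),
                 nObs = sum(!is.na(nox))),
             keyby = .(site, obsDate)]
daily
```

The `daily` table could then be passed to the same `ggplot()` call with `x = obsDate` and `y = meanNox`. Keeping `nObs` makes it possible to spot days whose mean rests on only a few hourly readings.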